GMM based speaker identification using training-time-dependent number of mixtures
Abstract
In this paper, we present the study of the performance of our standard GMM speaker identification system in "a limited amount of training data" context. We explore the use of different mixture components for different speakers/models. Different approaches are presented: (a) a nonlinear transformation of speech duration vs. number of mixtures is proposed in order to set correctly the appropriate number of model mixtures for each speaker according to the available training data. (b) From exhaustive experiments, the appropriate linear transformation is deduced. The resulting transformation offers several advantages: (a) each speaker is well modelized (b) the performance is improved by more than 6% on the SPIDRE corpus and finally (c) the number of mixtures is reduced and thus leads to a faster system response.
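The abstract does not state the actual transformation, so the following is only a minimal sketch of the general idea: choosing a per-speaker mixture count from the amount of training speech. The function name `mixtures_for_duration`, the slope, the intercept, and the clamping bounds are all hypothetical values picked for illustration, not figures from the paper.

```python
def mixtures_for_duration(duration_s, slope=0.5, intercept=2.0,
                          min_mix=1, max_mix=32):
    """Map training-speech duration (seconds) to a number of GMM mixtures.

    Hypothetical linear mapping: slope, intercept, and bounds are
    illustrative assumptions, not the values deduced in the paper.
    """
    if duration_s < 0:
        raise ValueError("duration must be non-negative")
    raw = slope * duration_s + intercept
    # Round to an integer count and clamp to the allowed range.
    return max(min_mix, min(max_mix, int(round(raw))))


# Made-up training durations (seconds) for three speakers.
durations = {"spk_a": 10.0, "spk_b": 44.0, "spk_c": 120.0}
mixture_counts = {spk: mixtures_for_duration(d) for spk, d in durations.items()}
```

Under these assumed parameters, speakers with little training speech get small models, while the upper bound keeps long recordings from producing very large ones.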
Similar resources
The Estimating Optimal Number of Gaussian Mixtures Based on Incremental k-means for Speaker Identification
Gaussian mixture model (GMM) is generally used to estimate the speaker model from speech for speaker identification. In this paper, we propose the method that estimates the optimal number of Gaussian mixtures based on incremental k-means for speaker identification. In the proposed method, the initialization with the optimal number of mixtures is done by adding dynamically the number of mixtures...
Modeling high-level information by using Gaussian mixture correlation for GMM-UBM based speaker recognition
The Gaussian mixture model-universal background model (GMM-UBM) has been dominant in text-independent speaker recognition tasks. However the conventional GMM-UBM method assumes that each Gaussian mixture is independent and ignores the fact that within Gaussian mixtures, there do exist some useful high-level speaker-dependent characteristics, such as word usage or speaking habits. Based on the G...
Minimum classification error training for speaker identification using Gaussian mixture models based on multi-space probability distribution
In our previous work, we have proposed a speaker modeling technique using spectral and pitch features for text-independent speaker identification based on Multi-Space Probability Distribution Gaussian Mixture Models (MSD-GMMs). We have presented a maximum likelihood (ML) estimation procedure for the MSD-GMM parameters and demonstrated its high recognition performance. In this paper, we describe...
A discriminative training algorithm for Gaussian mixture speaker models
The Gaussian mixture speaker model (GMM) is usually trained with the expectation-maximization (EM) algorithm to maximize the likelihood (ML) of observation data from an individual class. The GMM trained based on the ML criterion has weak discriminative power when used as a classifier. In this paper, a discriminative training procedure is proposed to fine-tune the parameters in the GMMs. The goal o...
A Discriminative Training Algorithm for Gaussian Mixture Speaker Models
The Gaussian mixture speaker model (GMM) is usually trained with the expectation-maximization (EM) algorithm to maximize the likelihood (ML) of observation data from an individual class. The GMM trained based on the ML criterion has weak discriminative power when used as a classifier. In this paper, a discriminative training procedure is proposed to fine-tune the parameters in the GMMs. The goal o...
Publication date: 1998